45 research outputs found

Efficiency analysis methodology of FPGAs based on lost frequencies, area and cycles

Author: Braeken An
Cornelis Jan G.
da Silva Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

We propose a methodology to study and quantify efficiency and the impact of overheads on runtime performance. Most work on High-Performance Computing (HPC) for FPGAs only studies runtime performance or cost, while we are interested in how far we are from peak performance and, more importantly, why. The efficiency of runtime performance is defined with respect to the ideal computational runtime in the absence of inefficiencies. The analysis of the difference between actual and ideal runtime reveals the overheads and bottlenecks. A formal approach is proposed to decompose the efficiency into three components: frequency, area and cycles. After quantification of the efficiencies, a detailed analysis must reveal the reasons for the lost frequencies, lost area and lost cycles. We propose a taxonomy of possible causes and practical methods to identify and quantify the overheads. The methodology is applied to a number of use cases for illustration. We show the interaction between the three components of efficiency and show how bottlenecks are revealed.

Ghent University Academic Bibliography

A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis

Author: Braeken An
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

Ghent University Academic Bibliography

Performance and toolchain of a combined GPU/FPGA desktop

Author: Braeken An
Cornelis Jan G.
D'Hollander Erik
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/01/2013

Low-power, high-performance computing nowadays relies on accelerator cards to speed up the calculations. Combining the power of GPUs with the flexibility of FPGAs enlarges the scope of problems that can be accelerated. We describe the performance analysis of a desktop equipped with a Tesla 2050 GPU and a Virtex-6 LX 240T FPGA. The balance between the I/O and the raw peak performance is analyzed using the roofline model. A well-tuned accelerator-based codesign, identifying the parallelism and the computation and data patterns of different classes of algorithms, makes it possible to maximize the performance of the combined GPU/FPGA system.
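
The roofline model named in the abstract above bounds attainable performance by the lower of the peak compute rate and the product of memory (or I/O) bandwidth and arithmetic intensity. The following is a minimal sketch of that general bound, not the authors' analysis; the function names and the accelerator figures are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def roofline(peak_gflops, bandwidth_gbs, intensity):
    """Attainable GFLOP/s for a kernel with the given arithmetic
    intensity (FLOPs per byte moved) under the roofline model."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which the bound switches from
    bandwidth-limited to compute-limited."""
    return peak_gflops / bandwidth_gbs


# Hypothetical accelerator figures (assumptions, not taken from the paper).
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
    for ai in (0.5, 10.0, 50.0):
        limit = "bandwidth" if ai < ridge else "compute"
        attainable = roofline(PEAK_GFLOPS, BANDWIDTH_GBS, ai)
        print(f"intensity {ai:5.1f} FLOP/B -> {attainable:7.1f} GFLOP/s ({limit}-bound)")
```

A kernel whose intensity lies below the ridge point gains nothing from more compute units, which is why the abstract stresses the balance between I/O and raw peak performance.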

Ghent University Academic Bibliography

Study of combining GPU/FPGA accelerators for high-performance computing

Author: Braeken An
Cornelis Jan G
D'Hollander Erik
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: HiPEAC
Publication date: 01/01/2013

This contribution presents the performance modeling of a super desktop with GPU and FPGA accelerators, using OpenCL for the GPU and a high-level synthesis compiler for the FPGAs. The performance model is used to evaluate the different high-level synthesis optimizations, taking into account the resource usage, and to compare the compute power of the FPGA with the GP

Ghent University Academic Bibliography

Efficient and Effective Learning of HMMs Based on Identification of Hidden States

Author: Jan Lemeire
Tingting Liu
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2017

The predominant learning algorithms for Hidden Markov Models (HMMs) are local search heuristics, of which the Baum-Welch (BW) algorithm is the most widely used. It is an iterative learning procedure starting with a predefined state-space size and randomly chosen initial parameters. However, poorly chosen initial parameters risk convergence to a local optimum and a low convergence speed. To overcome these drawbacks, we propose to use a more suitable model initialization approach, a Segmentation-Clustering and Transient analysis (SCT) framework, to estimate the number of states and model parameters directly from the input data. Based on an analysis of the information flow through HMMs, we demystify the structure of models and show that high-impact states are directly identifiable from the properties of observation sequences. States having a high impact on the log-likelihood make HMMs highly specific. Experimental results show that, even though the identification accuracy drops to 87.9% when random models are considered, the SCT method is around 50 to 260 times faster than the BW algorithm with 100% correct identification for highly specific models whose specificity is greater than 0.06.

Crossref

Directory of Open Access Journals

A flexible numerical framework for engineering - a Response Surface Modelling application

Author: Aldinucci Marco
d'
Lemeire Jan
Viviani Paolo
Vucinic Dean
Publication venue: ACEX CONFERENCE
Publication date: 01/01/2016

Institutional Research Information System University of Turin

A combined GPGPU-FPGA high-performance desktop

Author: Braeken An
Cornelis Jan
D'Hollander Erik
da Silva Gomes Bruno
Enescu Valentin
Lemeire Jan
Touhafi Abdellah
Publication venue
Publication date: 01/01/2012

Computation-intensive interactive software applications on R&D desktops require a versatile high-performance hardware and software environment. Present-day solutions focus on one technology, e.g. GPUs, grids, multi-cores, clusters, … To leverage the power of different technologies, a hybrid solution is presented, combining the power of General-Purpose Graphical Processing Units (GPGPUs) and Field Programmable Gate Arrays (FPGAs

Ghent University Academic Bibliography

Heterogeneous cloud computing: design methodology to combine hardware accelerators

Author: Braeken An
Cornelis Jan G.
D'Hollander Erik
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Crossref

Ghent University Academic Bibliography

Hidden Semi-Markov Models for Predictive Maintenance

Author: Francesco Cartella
Hichem Sahli
Jan Lemeire
Luca Dimiccoli
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Realistic predictive maintenance approaches are essential for condition monitoring and predictive maintenance of industrial machines. In this work, we propose Hidden Semi-Markov Models (HSMMs) that (i) place no constraints on the state duration density function and (ii) can be applied to continuous or discrete observations. To deal with this type of HSMM, we also propose modifications to the learning, inference, and prediction algorithms. Finally, automatic model selection has been made possible using the Akaike Information Criterion. This paper describes the theoretical formalization of the model as well as several experiments performed on simulated and real data with the aim of validating the methodology. In all performed experiments, the model is able to correctly estimate the current state and to effectively predict the time to a predefined event with a low overall average absolute error. As a consequence, its applicability to real-world settings can be beneficial, especially where the Remaining Useful Lifetime (RUL) of the machine is calculated in real time.
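
Model selection with the Akaike Information Criterion, as mentioned in the abstract above, scores each fitted model as AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood, and keeps the model with the lowest score. The following is a minimal sketch of that standard criterion only; the candidate names, log-likelihoods and parameter counts are made-up illustrations, not results from the paper, and the paper's actual selection procedure may differ.

```python
def aic(log_likelihood, num_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood


def select_model(candidates):
    """Return the name of the candidate with the lowest AIC.

    candidates maps a model name to (maximized log-likelihood,
    number of free parameters).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical fitted models (illustrative values, not from the paper).
candidates = {
    "3 states": (-1250.0, 20),
    "4 states": (-1200.0, 30),
    "5 states": (-1195.0, 42),
}

if __name__ == "__main__":
    for name, (loglik, k) in candidates.items():
        print(f"{name}: AIC = {aic(loglik, k):.1f}")
    print("selected:", select_model(candidates))
```

In this made-up example the five-state model fits slightly better than the four-state one, but the penalty for its extra parameters outweighs the gain in likelihood.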

Crossref

Directory of Open Access Journals

Learning causal models of multivariate systems: and the value of IT for the performance modeling of computer programs

Author: Lemeire Jan
Publication venue: 'American Society of Plant Biologists (ASPB)'
Publication date: 01/01/2007

CERN Document Server